Reveal Entities from Texts with a Hybrid Approach
Authors
Abstract
The tasks of entity extraction, recognition, and linking are strongly affected by the nature of the textual documents being analyzed. Much research effort has focused on improving each task separately for formal text (such as newswire documents) and for informal text (such as tweets). In this work, we propose a hybrid approach that aims to be agnostic to the document type. Two datasets, namely the #Micropost2014 NEEL corpus and the OKE2015 test dataset, are used to benchmark the performance of our approach. The experimental results show that the approach presented in this paper outperforms the state-of-the-art systems on the OKE2015 dataset and provides good results on the #Micropost2014 dataset.
Similar resources
An Experimental Study of a Hybrid Entity Recognition and Linking System
We present an experimental study of the performance of a hybrid semantic and linguistic system for recognizing and linking entities from formal and informal texts. In the current literature, systems are generally tailored to one or a few types of textual documents (e.g. narrative texts, newswire articles, informal text such as microposts). In contrast, we assess the performance of a hybrid appr...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three approaches have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and a hybrid of the two. Machine-learning-based methods perform well on the Persian language if they are trained with good features. To get good performanc...
Improving Persian Named Entity Recognition Using the Ezafe Marker
Named entity recognition is a process in which the names of people, places (cities, countries, seas, etc.), and organizations (public and private companies, international institutions, etc.), as well as dates, currencies, and percentages in a text, are identified. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...
A hybrid approach for extracting semantic relations from texts
We present an approach for extracting relations from texts that exploits linguistic and empirical strategies by means of a pipeline involving a parser, a part-of-speech tagger, a named entity recognition system, pattern-based classification and word sense disambiguation models, and resources such as ontologies, knowledge bases, and lexical databases. The relations extracted can be used for vario...